.NET: Fix declarative workflow regressions for hosted agents#5905

Open
alliscode wants to merge 6 commits into microsoft:main from alliscode:triage-fix

@alliscode (Member) commented May 15, 2026

Several regressions surfaced when running a declarative workflow as a Foundry hosted agent. Together they caused condition groups to fall through to elseActions, leaked the raw agent JSON to the caller, silently dropped SendActivity output, and corrupted session state across follow-up turns.

1. AgentProviderExtensions.InvokeAgentAsync forced autoSend: true whenever the agent ran on the workflow conversation, overriding the explicit autoSend: false declared in workflow.yaml and streaming the raw structured-output JSON straight to the user. Honor the caller-supplied autoSend instead.

2. Variable name/namespace resolution failed on request threads. IWorkflowContextExtensions.ReadState / QueueStateUpdateAsync / QueueStateResetAsync and WorkflowDiagnostics.InitializeDefaults took the variable name and namespace alias directly from PropertyPath.VariableName / NamespaceAlias. In Microsoft.Agents.ObjectModel 2026.2.4.1 those properties are evaluated lazily against ProductContext.Current, which is AsyncLocal<T>-scoped. The workflow sets the Foundry product on the build thread via WorkflowFormulaState's constructor, but when the workflow is hosted (AsAIAgent + AddFoundryResponses) each HTTP request runs on a fresh logical context where that AsyncLocal is in its default state. The parser then returns null for a dotted reference such as Local.Triage (despite SegmentCount == 2 and IsValid == true), so every assignment threw ArgumentNullException via Throw.IfNull, and IsManagedScope returned false so the underlying State.Set call was silently skipped. In-process declarative samples are unaffected because Build and InProcessExecution.RunStreamingAsync share one async chain, so the build-thread ProductContext flows forward through await.

Introduce a PropertyPathExtensions helper that reconstructs the variable name from PropertyPath.Segments() and translates the user-facing Local alias to its canonical Topic form, bypassing the AsyncLocal-dependent parser. Route the affected call sites through the helper. WorkflowFormulaState.Bind continues to expose the namespace as Local to PowerFx, so workflow YAML and expressions are unchanged.

3. SendActivity output was dropped from the streaming response. SendActivityExecutor yielded only an AgentResponseEvent, which is gated off by _includeWorkflowOutputsInResponse = false when the workflow is hosted as an agent (to avoid duplicating the agent's own autoSend stream). As a result, any clarification or hand-off text emitted via SendActivity never reached the caller. Emit an AgentResponseUpdateEvent alongside the existing event so SendActivity text is surfaced via the normal streaming path.

4. HITL approval state was lost when local AIFunctions ran inside the hosted agent. InvokeFunctionToolExecutor.CaptureResponseAsync only acted on FunctionResultContent. In the hosted Foundry path the approval response arrives as a ToolApprovalResponseContent with no matching FunctionResultContent, so the local AIFunction never ran and downstream PropertyPath/SendActivity consumers (e.g. {Local.RefundResult}) saw empty values. When no FunctionResultContent matches but an approved ToolApprovalResponseContent does, look up the registered AIFunction by name on agentProvider.Functions and invoke it with the evaluated arguments, surfacing the result through the existing assignment path.
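The AsyncLocal<T> behavior behind item 2 can be reproduced in isolation. The sketch below is not the ObjectModel code: `AmbientProduct` is a hypothetical stand-in for an AsyncLocal-scoped ambient value such as ProductContext.Current, and `ExecutionContext.SuppressFlow()` is used only to simulate a request that starts on a fresh logical context.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for an AsyncLocal-scoped ambient value.
public static class AmbientProduct
{
    private static readonly AsyncLocal<string?> s_current = new();

    public static string? Current
    {
        get => s_current.Value;
        set => s_current.Value = value;
    }
}

public static class AsyncLocalDemo
{
    // In-process case: the continuation after await runs on the same
    // logical flow that set the value, so the value is still visible.
    public static async Task<string?> ReadOnSameFlowAsync()
    {
        AmbientProduct.Current = "Foundry";
        await Task.Yield();
        return AmbientProduct.Current;
    }

    // Hosted case (simulated): the work item is queued without the caller's
    // ExecutionContext, so the AsyncLocal is back in its default state.
    public static Task<string?> ReadOnFreshFlowAsync()
    {
        AmbientProduct.Current = "Foundry";
        using (ExecutionContext.SuppressFlow())
        {
            return Task.Run(() => AmbientProduct.Current);
        }
    }
}
```

Under these assumptions `ReadOnSameFlowAsync` yields "Foundry" while `ReadOnFreshFlowAsync` yields null, which mirrors why in-process samples kept working and hosted requests did not.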

AgentFrameworkResponseHandler continues to key the AgentSession store solely on conversation_id. Threading session state via the response-id chain was considered but rejected: a previous_response_id chain can legitimately weave through other agents (agent A → response R1, agent B → response R2 with previous=R1, agent A again with previous=R2), so the only reliable cross-turn anchor for an individual hosted agent is the conversation field the client passes on every request. Clients that need HITL state across turns must include conversation in each request — the in-tree samples and azd ai agent invoke already do.
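The interleaving described above can be made concrete with a small sketch. It is not the AgentFrameworkResponseHandler implementation: `ConversationSessionStore`, `ChainDemo`, the ids `R1`, `R2`, `conv-1`, and the string payload are all hypothetical, chosen only to show why a lookup by previous response id misses once another agent has produced the latest response.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical per-agent store keyed solely on the conversation id.
public sealed class ConversationSessionStore
{
    private readonly Dictionary<string, string> _sessions = new(StringComparer.Ordinal);

    public void Save(string? conversationId, string state)
    {
        if (conversationId is not null)
        {
            _sessions[conversationId] = state;
        }
    }

    public string? Load(string? conversationId) =>
        conversationId is not null && _sessions.TryGetValue(conversationId, out string? state)
            ? state
            : null;
}

public static class ChainDemo
{
    // Agent A answers (R1), agent B answers (R2, previous = R1),
    // then agent A is invoked again with previous = R2.
    public static (string? ByResponseId, string? ByConversation) Run()
    {
        var byResponseId = new Dictionary<string, string>(StringComparer.Ordinal);
        var byConversation = new ConversationSessionStore();

        // Turn 1: agent A produces R1 and saves its pending-approval state.
        byResponseId["R1"] = "approval-map";
        byConversation.Save("conv-1", "approval-map");

        // Turn 2: agent B produces R2; agent A's stores are never touched.

        // Turn 3: agent A sees previous_response_id = R2, conversation = conv-1.
        byResponseId.TryGetValue("R2", out string? viaChain);
        return (viaChain, byConversation.Load("conv-1"));
    }
}
```

In this sketch the response-id lookup returns null on turn 3, while the conversation-keyed lookup still finds the state saved on turn 1.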

Verified end-to-end against a deployed Foundry hosted agent: the declarative triage workflow now routes Technical / Billing / General inputs correctly, SendActivity clarifications stream back to the caller, approved local refund functions run and populate workflow variables, and only autoSend-eligible agent messages are forwarded.

Three regressions surfaced when running a declarative workflow as a
Foundry hosted agent. Together they caused every condition group to fall
through to elseActions and the raw agent JSON to leak to the caller.

1. AgentProviderExtensions.InvokeAgentAsync forced autoSend to true
   whenever the agent ran on the workflow conversation, which overrode
   the explicit autoSend: false declared in workflow.yaml and streamed
   the raw structured-output JSON straight to the user. Honor the
   caller-supplied autoSend instead.

2. IWorkflowContextExtensions.ReadState / QueueStateUpdateAsync /
   QueueStateResetAsync took the variable name and namespace alias
   directly from PropertyPath.VariableName / NamespaceAlias. Against
   Microsoft.Agents.ObjectModel 2026.2.4.1 those properties return null
   for a dotted reference such as `Local.Triage` even when
   SegmentCount == 2 and IsValid == true, so every assignment threw
   ArgumentNullException via Throw.IfNull. Fall back to Segments() to
   reconstruct the name and alias when the parser returns null.

3. The same ObjectModel version no longer recognizes the user-facing
   `Local` scope alias: VariableScopeNames.IsValidName(`Local`)
   returns false and GetNamespaceFromName(`Local`) returns Unknown, so
   the declarative interpreter's IsManagedScope check fails and the
   State.Set call is silently skipped. Translate the `Local` alias to
   its canonical `Topic` form before forwarding to
   QueueStateUpdateAsync; WorkflowFormulaState.Bind continues to expose
   it as `Local` to PowerFx.

Verified end-to-end against a deployed Foundry hosted agent: the
declarative triage workflow now routes Technical / Billing / General
inputs correctly and only the autoSend-eligible messages reach the
caller.
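The fallback and alias translation from items 2 and 3 can be sketched without the ObjectModel types. The code below is an illustration only: `ScopedPath.Resolve` is a hypothetical helper that works on a dotted string rather than on PropertyPath.Segments(), and both the exact two-segment shape and the case-sensitive comparison are assumptions.

```csharp
#nullable enable
using System;

public static class ScopedPath
{
    // User-facing alias and its canonical scope name, as described in the commit message.
    private const string LocalAlias = "Local";
    private const string TopicScope = "Topic";

    // Splits "Scope.Name" into its two segments and remaps the Local alias to Topic.
    // Throws for anything that is not exactly two non-empty segments (fail fast).
    public static (string Scope, string Name) Resolve(string dottedPath)
    {
        if (dottedPath is null)
        {
            throw new ArgumentNullException(nameof(dottedPath));
        }

        string[] segments = dottedPath.Split('.');
        if (segments.Length != 2 || segments[0].Length == 0 || segments[1].Length == 0)
        {
            throw new ArgumentException($"Expected 'Scope.Name', got '{dottedPath}'.", nameof(dottedPath));
        }

        // Assumption: alias matching is case-sensitive.
        string scope = string.Equals(segments[0], LocalAlias, StringComparison.Ordinal)
            ? TopicScope
            : segments[0];

        return (scope, segments[1]);
    }
}
```

With this sketch, `ScopedPath.Resolve("Local.Triage")` produces the scope `Topic` and the name `Triage`, and a path with any other scope passes through unchanged.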

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 15, 2026 21:23
Copilot AI (Contributor) left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes three regressions that prevented declarative workflows from running correctly as Foundry hosted agents: an autoSend override leaking raw agent JSON, a PropertyPath parsing regression in ObjectModel 2026.2.4.1, and a Local scope alias no longer being recognized.

Changes:

  • Honor caller-supplied autoSend instead of forcing it to true for the workflow conversation.
  • Fall back to Segments() to reconstruct variable name / namespace alias when PropertyPath.VariableName / NamespaceAlias return null.
  • Translate the user-facing Local scope alias to its canonical Topic form before forwarding to state-update APIs.
Summary per file

| File | Description |
| --- | --- |
| dotnet/src/Microsoft.Agents.AI.Workflows.Declarative/Extensions/AgentProviderExtensions.cs | Removes the unconditional autoSend override for the workflow conversation. |
| dotnet/src/Microsoft.Agents.AI.Workflows.Declarative/Extensions/IWorkflowContextExtensions.cs | Adds GetVariableName / GetNamespaceAlias helpers that compensate for ObjectModel regressions and remap Local → Topic. |

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 6

@moonbox3 added the .NET and workflows (Related to Workflows in agent-framework) labels May 15, 2026
@github-actions bot changed the title from "Fix declarative workflow regressions for hosted agents" to ".NET: Fix declarative workflow regressions for hosted agents" May 15, 2026
@github-actions github-actions Bot left a comment

Automated Code Review

Reviewers: 3 | Confidence: 86%

✓ Correctness

The three regression fixes are logically correct. The autoSend |= isWorkflowConversation removal properly honors caller intent, and both isWorkflowConversation and workflowConversationId remain used at line 51. The PropertyPath workaround via GetVariableName/GetNamespaceAlias correctly reconstructs the variable name and alias from Segments(), and the "Local" → VariableScopeNames.Topic translation ensures IsManagedScope (which relies on VariableScopeNames.IsValidName) succeeds, allowing State.Set to execute on the managed path. The existing review thread already covers the key improvement areas (readability, null guarding, magic strings, TODO markers, test coverage). I found no new blocking correctness issues beyond those already flagged.

✓ Test Coverage

This PR fixes three declarative workflow regressions but introduces no new tests for the changed behavior. The autoSend behavioral change in AgentProviderExtensions (removing autoSend |= isWorkflowConversation) has zero test coverage — the only test exercising InvokeAgentAsync (Workflows/InvokeAgent.cs:70) hardcodes autoSend = true and never verifies the autoSend: false + workflow-conversation scenario that this fix is meant to correct. The GetVariableName/GetNamespaceAlias helper test gap was already flagged in the prior review at line 46 and is not re-raised here.

✗ Design Approach

The overall direction looks right, but the PropertyPath regression workaround is incomplete: runtime reads/writes now reconstruct dotted Local.* paths, while workflow initialization still drops those same variables when seeding default state. That leaves a real class of declarative workflows partially broken even after this PR.

Flagged Issues

  • The null-PropertyPath workaround is only applied in IWorkflowContextExtensions. DeclarativeWorkflowBuilder still calls state.Initialize(...) during startup (DeclarativeWorkflowBuilder.cs:76), and WorkflowDiagnostics.InitializeDefaults silently skips any variable whose variableDiagnostic.Path.VariableName is null (WorkflowDiagnostics.cs:61-66). Since dotted refs like Local.Triage now produce null VariableName, a workflow with a declared default for such a variable will still start blank. The same fallback/remapping needs to be carried into the initialization path.

Automated review by alliscode's agents

…; run approved local AIFunctions

Two regressions hit declarative workflows that use require_approval=true when
the client chains turns via previous_response_id (no conversation_id):

1. AgentFrameworkResponseHandler keyed the AgentSession store solely on
   conversation_id, so when only previous_response_id was present the
   StateBag (which holds ToolApprovalIdMap) was discarded after each turn.
   The next turn then threw 'No approval mapping recorded for wire id ...'
   in InputConverter.ConvertMcpApprovalResponse.

   Fix: fall back to previous_response_id on load and to context.ResponseId
   on save so the response-id chain becomes a valid session key. Conversation
   id remains preferred when present.

2. InvokeFunctionToolExecutor.CaptureResponseAsync only acted on
   FunctionResultContent. In the hosted Foundry path the approval response
   arrives as a ToolApprovalResponseContent with no FunctionResultContent,
   so the local AIFunction never ran and downstream PropertyPath/SendActivity
   consumers (e.g. {Local.RefundResult}) saw empty values.

   Fix: when no FunctionResultContent matches but an approved
   ToolApprovalResponseContent does, look up the registered AIFunction by
   name on agentProvider.Functions and invoke it with the evaluated
   arguments, surfacing the result through the existing assignment path.
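The second fix can be sketched as follows. This is a simplified illustration under stated assumptions, not the InvokeFunctionToolExecutor code: `Content`, `FunctionResult`, `ApprovalResponse`, and `ApprovalFallback.Capture` are hypothetical stand-ins for the Microsoft.Extensions.AI content types and the executor method, and local functions are modeled as parameterless delegates keyed by name.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for FunctionResultContent / ToolApprovalResponseContent.
public abstract record Content(string CallId);
public sealed record FunctionResult(string CallId, string Text) : Content(CallId);
public sealed record ApprovalResponse(string CallId, bool Approved) : Content(CallId);

public static class ApprovalFallback
{
    // Returns the text to assign to the workflow variable, or null if nothing applies.
    public static string? Capture(
        string callId,
        string functionName,
        IReadOnlyList<Content> contents,
        IReadOnlyDictionary<string, Func<object?>> localFunctions)
    {
        // Existing path: a function result for this call is already present.
        FunctionResult? existing = contents.OfType<FunctionResult>()
            .FirstOrDefault(c => c.CallId == callId);
        if (existing is not null)
        {
            return existing.Text;
        }

        // New path: only an approved approval response arrived for this call.
        bool approved = contents.OfType<ApprovalResponse>()
            .Any(c => c.CallId == callId && c.Approved);
        if (!approved)
        {
            return null;
        }

        if (!localFunctions.TryGetValue(functionName, out var function))
        {
            return $"Function '{functionName}' is not registered with the agent provider.";
        }

        // Run the local function and normalize its result to text.
        object? result = function();
        return result switch
        {
            null => string.Empty,
            string s => s,
            _ => result.ToString() ?? string.Empty,
        };
    }
}
```

A rejected approval, or no matching content at all, yields null here so the caller can leave the variable untouched; how the real executor treats those cases is not shown in this thread.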

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…lpers

Address PR microsoft#5905 review feedback:

* Move the PropertyPath VariableName/NamespaceAlias fallback and 'Local'
  -> 'Topic' scope remap into a shared internal PropertyPathExtensions
  helper. Materializes Segments() once, names the magic 'Local' alias
  as a const, and carries a TODO referencing the tracking issue.

* Apply the same helper in WorkflowDiagnostics.InitializeDefaults so a
  declared default for a dotted variable like 'Local.Triage' is no
  longer silently skipped at workflow startup (closes the gap flagged
  by the reviewer: runtime ReadState/QueueStateUpdateAsync worked but
  state.Initialize did not).

* Restore the previous strict failure mode on namespace alias by
  wrapping GetNamespaceAlias() in Throw.IfNull at call sites so a
  malformed single-segment path keeps failing fast rather than
  silently passing null to State.Get/Set.

All 821 unit tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
alliscode and others added 2 commits May 18, 2026 16:25
Covers the autoSend regression fix: when the agent runs on the workflow conversation with autoSend=false, no AgentResponseUpdateEvent or AgentResponseEvent is added to the context. Also covers autoSend=true (events emitted) and autoSend=false on a non-workflow conversation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
SendActivityExecutor previously only emitted the activity text via YieldOutputAsync, which the runtime converts to an AgentResponseEvent. WorkflowSession gates AgentResponseEvent behind includeWorkflowOutputsInResponse, so when a host opts out of summary outputs (the default for AsAIAgent) the SendActivity reply is silently dropped.

Mirror the pattern used by AgentProviderExtensions for autoSend agent invocations: also emit an AgentResponseUpdateEvent, which WorkflowSession yields unconditionally. This makes SendActivity reliably reach chat-protocol clients without requiring includeWorkflowOutputsInResponse = true (which would also duplicate autoSend agent output).
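The gating described above can be modeled with a short sketch. The types are hypothetical stand-ins (`ResponseEvent` for AgentResponseEvent, `ResponseUpdateEvent` for AgentResponseUpdateEvent, `Forward` for the WorkflowSession filter), and the order in which the two events are emitted is an assumption.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical stand-ins for the workflow event types.
public abstract record WorkflowEvent(string Text);
public sealed record ResponseEvent(string Text) : WorkflowEvent(Text);
public sealed record ResponseUpdateEvent(string Text) : WorkflowEvent(Text);

public static class StreamingSketch
{
    // After the fix: the activity text is emitted as an update event in
    // addition to the summary-style response event.
    public static IEnumerable<WorkflowEvent> SendActivity(string text)
    {
        yield return new ResponseUpdateEvent(text);
        yield return new ResponseEvent(text);
    }

    // Update events are always forwarded; response events only when the
    // host opted in to workflow outputs.
    public static IEnumerable<string> Forward(
        IEnumerable<WorkflowEvent> events,
        bool includeWorkflowOutputsInResponse)
    {
        foreach (WorkflowEvent e in events)
        {
            switch (e)
            {
                case ResponseUpdateEvent update:
                    yield return update.Text;
                    break;
                case ResponseEvent response when includeWorkflowOutputsInResponse:
                    yield return response.Text;
                    break;
            }
        }
    }
}
```

With the flag off, the text reaches the caller exactly once through the update event; with the flag on, this model forwards it twice, which is the duplication the commit message warns about.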

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The fallback let a session be keyed by an unbroken previous_response_id chain,
but conversation_id is the right way to thread state across turns: it survives
shared/branched chains (e.g. when another agent generates a response in between)
and is the documented model for stateful clients. Restore conversation_id as the
sole session key and rely on the client to thread it. The InvokeFunctionTool
approval/local-function half of 1baf4af remains.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI (Contributor) left a comment

Copilot's findings

  • Files reviewed: 7/7 changed files
  • Comments generated: 5

Comment on lines +51 to +57
    internal static string? GetNamespaceAlias(this PropertyPath variablePath)
    {
        string? alias = variablePath.NamespaceAlias;
        if (alias is null && variablePath.SegmentCount >= 2)
        {
            alias = variablePath.Segments().FirstOrDefault().PropertyName;
        }

    {
        null => string.Empty,
        string s => s,
        _ => result.ToString() ?? string.Empty,

    {
        string functionName = this.GetFunctionName();
        AIFunction? function = agentProvider.Functions?.FirstOrDefault(
            f => string.Equals(f.Name, functionName, System.StringComparison.Ordinal));

Comment on lines +268 to +273
    if (function is null)
    {
        return new FunctionResultContent(
            this.Id,
            $"Function '{functionName}' is not registered with the agent provider.");
    }

Comment on lines +27 to +28
    public Task AutoSendFalseOnWorkflowConversationSuppressesResponseEventsAsync() =>
        this.RunAsync(autoSend: false, conversationId: WorkflowConversationId, expectResponseEvents: false);

    /// 2026.2.4.1 returns null for dotted refs like "Local.Triage" even when SegmentCount
    /// is 2 and IsValid is true).
    /// </summary>
    internal static string? GetVariableName(this PropertyPath variablePath)
Contributor

nit: Can you verify that using this fix survives checkpointing? i.e. value stored in Local scope gets rehydrated as expected after checkpointing.
